OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Learning on the fly: a font-free approach toward multilingual OCR

Internal identifier: 000558 (Main/Exploration); previous: 000557; next: 000559

Authors: Andrew Kae [United States]; David A. Smith [United States]; Erik Learned-Miller [United States]

Source: International journal on document analysis and recognition (Print), 2011. ISSN 1433-2833.

RBID : Pascal:12-0083148

Abstract

Despite ubiquitous claims that optical character recognition (OCR) is a "solved problem," many categories of documents, such as those with moderate degradation or unusual fonts, continue to break modern OCR software. Many approaches rely on pre-computed or stored character models, but these are vulnerable when the font of a particular document was not part of the training set or when there is so much noise in a document that the font model becomes weak. To address these difficult cases, we present a form of iterative contextual modeling that learns character models directly from the document it is trying to recognize. We use these learned models both to segment the characters and to recognize them in an incremental, iterative process. We present results comparable with those of a commercial OCR system on a subset of characters from a difficult test document in both English and Greek.



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Learning on the fly: a font-free approach toward multilingual OCR</title>
<author>
<name sortKey="Kae, Andrew" sort="Kae, Andrew" uniqKey="Kae A" first="Andrew" last="Kae">Andrew Kae</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Department of Computer Science, University of Massachusetts Amherst, 140 Governors Drive</s1>
<s2>Amherst, MA 01003-9264</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Massachusetts</region>
<settlement type="city">Amherst (Massachusetts)</settlement>
</placeName>
<orgName type="university">Université du Massachusetts à Amherst</orgName>
</affiliation>
</author>
<author>
<name sortKey="Smith, David A" sort="Smith, David A" uniqKey="Smith D" first="David A." last="Smith">David A. Smith</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Department of Computer Science, University of Massachusetts Amherst, 140 Governors Drive</s1>
<s2>Amherst, MA 01003-9264</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Massachusetts</region>
<settlement type="city">Amherst (Massachusetts)</settlement>
</placeName>
<orgName type="university">Université du Massachusetts à Amherst</orgName>
</affiliation>
</author>
<author>
<name sortKey="Learned Miller, Erik" sort="Learned Miller, Erik" uniqKey="Learned Miller E" first="Erik" last="Learned-Miller">Erik Learned-Miller</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Department of Computer Science, University of Massachusetts Amherst, 140 Governors Drive</s1>
<s2>Amherst, MA 01003-9264</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Massachusetts</region>
<settlement type="city">Amherst (Massachusetts)</settlement>
</placeName>
<orgName type="university">Université du Massachusetts à Amherst</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">12-0083148</idno>
<date when="2011">2011</date>
<idno type="stanalyst">PASCAL 12-0083148 INIST</idno>
<idno type="RBID">Pascal:12-0083148</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000104</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000668</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000112</idno>
<idno type="wicri:doubleKey">1433-2833:2011:Kae A:learning:on:the</idno>
<idno type="wicri:Area/Main/Merge">000564</idno>
<idno type="wicri:Area/Main/Curation">000558</idno>
<idno type="wicri:Area/Main/Exploration">000558</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Learning on the fly: a font-free approach toward multilingual OCR</title>
<author>
<name sortKey="Kae, Andrew" sort="Kae, Andrew" uniqKey="Kae A" first="Andrew" last="Kae">Andrew Kae</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Department of Computer Science, University of Massachusetts Amherst, 140 Governors Drive</s1>
<s2>Amherst, MA 01003-9264</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Massachusetts</region>
<settlement type="city">Amherst (Massachusetts)</settlement>
</placeName>
<orgName type="university">Université du Massachusetts à Amherst</orgName>
</affiliation>
</author>
<author>
<name sortKey="Smith, David A" sort="Smith, David A" uniqKey="Smith D" first="David A." last="Smith">David A. Smith</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Department of Computer Science, University of Massachusetts Amherst, 140 Governors Drive</s1>
<s2>Amherst, MA 01003-9264</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Massachusetts</region>
<settlement type="city">Amherst (Massachusetts)</settlement>
</placeName>
<orgName type="university">Université du Massachusetts à Amherst</orgName>
</affiliation>
</author>
<author>
<name sortKey="Learned Miller, Erik" sort="Learned Miller, Erik" uniqKey="Learned Miller E" first="Erik" last="Learned-Miller">Erik Learned-Miller</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Department of Computer Science, University of Massachusetts Amherst, 140 Governors Drive</s1>
<s2>Amherst, MA 01003-9264</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Massachusetts</region>
<settlement type="city">Amherst (Massachusetts)</settlement>
</placeName>
<orgName type="university">Université du Massachusetts à Amherst</orgName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint>
<date when="2011">2011</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Document structure</term>
<term>Greek</term>
<term>Image processing</term>
<term>Iterative method</term>
<term>Iterative process</term>
<term>Modeling</term>
<term>Multilingualism</term>
<term>On the fly</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance caractère</term>
<term>Reconnaissance optique caractère</term>
<term>Reconnaissance forme</term>
<term>Traitement image</term>
<term>A la volée</term>
<term>Multilinguisme</term>
<term>Structure document</term>
<term>Processus itératif</term>
<term>Grec</term>
<term>Modélisation</term>
<term>Méthode itérative</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Multilinguisme</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Despite ubiquitous claims that optical character recognition (OCR) is a "solved problem," many categories of documents continue to break modern OCR software such as documents with moderate degradation or unusual fonts. Many approaches rely on pre-computed or stored character models, but these are vulnerable to cases when the font of a particular document was not part of the training set or when there is so much noise in a document that the font model becomes weak. To address these difficult cases, we present a form of iterative contextual modeling that learns character models directly from the document it is trying to recognize. We use these learned models both to segment the characters and to recognize them in an incremental, iterative process. We present results comparable with those of a commercial OCR system on a subset of characters from a difficult test document in both English and Greek.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Massachusetts</li>
</region>
<settlement>
<li>Amherst (Massachusetts)</li>
</settlement>
<orgName>
<li>Université du Massachusetts à Amherst</li>
</orgName>
</list>
<tree>
<country name="États-Unis">
<region name="Massachusetts">
<name sortKey="Kae, Andrew" sort="Kae, Andrew" uniqKey="Kae A" first="Andrew" last="Kae">Andrew Kae</name>
</region>
<name sortKey="Learned Miller, Erik" sort="Learned Miller, Erik" uniqKey="Learned Miller E" first="Erik" last="Learned-Miller">Erik Learned-Miller</name>
<name sortKey="Smith, David A" sort="Smith, David A" uniqKey="Smith D" first="David A." last="Smith">David A. Smith</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000558 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000558 | SxmlIndent | more
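
The commands above print the XML record shown earlier on this page. As a minimal sketch of how that output might then be read, the Python snippet below parses a trimmed copy of the record and pulls out the title, the author names, and the English keywords. It assumes Python 3 and only the standard library. The names `RECORD` and `summarize` are illustrative and are not part of Dilib. The elements and attributes with `wicri:` and `inist:` prefixes are left out of the trimmed copy, because the record as displayed does not declare those namespaces and a namespace-aware parser would likely reject them without added declarations.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record above (assumption: only the parts needed here).
# Prefixed wicri:/inist: items are omitted because their namespaces are not
# declared in the displayed record.
RECORD = """\
<record>
  <TEI>
    <teiHeader>
      <fileDesc>
        <titleStmt>
          <title xml:lang="en" level="a">Learning on the fly: a font-free approach toward multilingual OCR</title>
          <author><name first="Andrew" last="Kae">Andrew Kae</name></author>
          <author><name first="David A." last="Smith">David A. Smith</name></author>
          <author><name first="Erik" last="Learned-Miller">Erik Learned-Miller</name></author>
        </titleStmt>
      </fileDesc>
      <profileDesc>
        <textClass>
          <keywords scheme="KwdEn" xml:lang="en">
            <term>Character recognition</term>
            <term>Optical character recognition</term>
          </keywords>
        </textClass>
      </profileDesc>
    </teiHeader>
  </TEI>
</record>
"""


def summarize(record_xml: str) -> dict:
    """Return the title, author names, and English keywords of a record."""
    root = ET.fromstring(record_xml)
    # First <title> under <titleStmt> is the article title.
    title = root.findtext(".//titleStmt/title")
    # Each <author> holds one <name> whose text is the display name.
    authors = [name.text for name in root.findall(".//titleStmt/author/name")]
    # Select only the English keyword list by its scheme attribute.
    keywords = [
        term.text for term in root.findall(".//keywords[@scheme='KwdEn']/term")
    ]
    return {"title": title, "authors": authors, "keywords": keywords}


if __name__ == "__main__":
    print(summarize(RECORD))
```

To run it against the real output of `HfdSelect`, the prefixed elements would first need namespace declarations added or be stripped. Which of the two is appropriate depends on the local Dilib setup.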

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:12-0083148
   |texte=   Learning on the fly: a font-free approach toward multilingual OCR
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024